Source of Acquisition 
NASA Ames Research Center 


Assume-Guarantee Testing 


Colin Blundell 

Dept. of Comp. and Inf. Science
University of Pennsylvania 
Philadelphia, PA 19104, USA 

blundell@cis.upenn.edu


Dimitra Giannakopoulou 

RIACS/NASA Ames 
NASA Ames Research Center 
Moffett Field, CA 94035-1000, USA 

dimitra@email.arc.nasa.gov 


Corina S. Pasareanu 

QSS/NASA Ames 
NASA Ames Research Center 
Moffett Field, CA 94035-1000, USA 

pcorina@email.arc.nasa.gov


ABSTRACT 

Verification techniques for component-based systems should 
ideally be able to predict properties of the assembled system 
through analysis of individual components before assembly. 
This work introduces such a modular technique in the con- 
text of testing. Assume-guarantee testing relies on the (au- 
tomated) decomposition of key system-level requirements 
into local component requirements at design time. Develop- 
ers can verify the local requirements by checking components 
in isolation; failed checks may indicate violations of system 
requirements, while valid traces from different components 
compose via the assume-guarantee proof rule to potentially 
provide system coverage. These local requirements also form 
the foundation of a technique for efficient predictive testing 
of assembled systems: given a correct system run, this tech- 
nique can predict violations by alternative system runs with- 
out constructing those runs. We discuss the application of 
our approach to testing a multi-threaded NASA application, 
where we treat threads as components. 

1. INTRODUCTION

As software systems continue to grow in size and complexity, 
it is becoming common for developers to assemble them from 
new or reused components potentially developed by different 
parties. For these systems, it is important to have verifica- 
tion techniques that are modular as well, since verification 
is often the dominant software production cost. Developers 
could use such techniques to avoid expensive verification of 
assembled systems, instead performing verification primarily
on individual components. Unfortunately, the task of extracting
useful results from verification of components in isolation
is often difficult: first, developing environments that
will appropriately exercise individual components is challenging
and time-consuming, and second, inferring system
properties from results of local verification is typically nontrivial.
The growing popularity of component-based systems
makes it important for verification researchers to investigate
these challenges.

Assume-guarantee reasoning is a technique that has long 
held promise for modular verification. This technique is 
a “divide-and-conquer” approach that infers global system 
properties by checking individual components in isolation [4, 
13, 15, 17]. In its simplest form, it checks whether a compo- 
nent M guarantees a property P when it is part of a system 
that satisfies an assumption A, and checks that the remain- 
ing components in the system (M’s environment) satisfy A. 
Extensions that use an assumption for each component in 
the system also exist. Our previous work developed tech- 
niques that automatically generate assumptions for perform- 
ing assume-guarantee model checking at the design level [2, 
5, 9], ameliorating the often difficult challenge of finding an 
appropriate assumption. 

While design verification is important, it is also necessary 
to verify that an implementation preserves the design’s cor- 
rectness. For this purpose, we have also previously devel- 
oped a methodology that uses the assumptions created at 
the design level to model check source code in an assume- 
guarantee style [10]; with this methodology, it is possible 
to verify source code one component at a time. Hence, 
this technique has the potential to meet the challenges of 
component-based verification.

Unfortunately, despite the increased scalability that one can 
achieve by using assume-guarantee techniques in model check- 
ing, it remains a difficult task in the hands of experts to make 
the technique scale to the size of industrial systems. Fur- 
thermore, model checkers do not exist for many languages 
commonly used in industry. This work explores the bene- 
fits of assume-guarantee reasoning for testing, which is still 
the predominant industrial verification technique. We have 
developed assume-guarantee testing, which reuses properties,
assumptions, and proof rules from design-level assume-guarantee
verification to enhance both unit testing and whole-system
testing. The contributions of assume-guarantee testing
are as follows:

1. During unit testing, assume-guarantee testing has the
potential to obtain system coverage and detect system-level
errors. Our approach applies assume-guarantee reasoning
to component test traces, using assumptions as environments
to drive individual components. This process provides
guarantees on trace compositions that are analogous to the
guarantees obtained by assume-guarantee model checking.
Hence, the technique can infer that a (potentially large) set
of system traces satisfies a global property by checking traces
of components in isolation against assume-guarantee pairs.
Moreover, component test traces that fail their assume-guarantee
premises may uncover system-level violations. Assumptions
restrict the context of the components, thus reducing the
number of false positives obtained by verification (i.e., errors
that will never exhibit themselves in the context of
the particular system in which the component will be introduced).
As a result, the likelihood that a failed local
check corresponds to a system-level error is higher. Early
error detection is desirable, as it is well established that errors
discovered earlier in the development phase are easier
and cheaper to fix.

2. During whole-system testing, assume-guarantee testing 
has the potential to efficiently detect bugs and provide cov- 
erage. In this context, our approach projects system traces 
onto individual components, and applies assume-guarantee 
reasoning to the projections. This technique is an efficient 
means of predictive testing. Predictive testing detects the 
existence of bad traces from good traces [19]. It exploits 
the insight that one can reorder independent events from a 
trace to obtain different legal traces. Typically, predictive 
testing techniques discover these alternative traces by com- 
posing independent events in different orders. Our technique 
uses assume-guarantee reasoning to obtain results about the 
alternative interleavings without explicitly exploring them, 
and thus is potentially more efficient. 

We experimented with our assume-guarantee testing frame- 
work in the context of the Eagle runtime analysis tool [3], 
and applied our approach to a NASA software system also 
used in the demonstration of our design-level assume-guarantee 
reasoning techniques. In the analysis of a specific property 
(P) during these experiments, we found a discrepancy be- 
tween one of the components and the design that it im- 
plements. This discrepancy does not cause the system to 
violate P; monolithic model checking would therefore not 
have detected it. 

The remainder of the paper is organized as follows. We 
first provide some background in Section 2, followed by a 
discussion of our assume-guarantee testing approach and its 
advantages in Section 3. Section 4 describes the experience 
and results obtained by the application of our techniques to 
a NASA system. Finally, Section 5 presents related work 
and Section 6 concludes the paper. 

2. BACKGROUND

LTSs. At design level, this work uses Labeled Transition
Systems (LTSs) to model the behavior of communicating
components. Let $Act$ be the universal set of observable actions
and let $\tau$ denote a local action unobservable to a component’s
environment. An LTS $M$ is a quadruple $\langle Q, \alpha M, \delta, q_0 \rangle$
where:

• $Q$ is a non-empty finite set of states

• $\alpha M \subseteq Act$ is a finite set of observable actions called
the alphabet of $M$

• $\delta \subseteq Q \times (\alpha M \cup \{\tau\}) \times Q$ is a transition relation

• $q_0 \in Q$ is the initial state

Let $M = \langle Q, \alpha M, \delta, q_0 \rangle$ and $M' = \langle Q', \alpha M', \delta', q_0' \rangle$. We say
that $M$ transits into $M'$ with action $a$, denoted $M \xrightarrow{a} M'$,
if and only if $(q_0, a, q_0') \in \delta$ and $\alpha M = \alpha M'$ and $\delta = \delta'$.

Traces. A trace $t$ of an LTS $M$ is a sequence of observable
actions that $M$ can perform starting at its initial state. For
$\Sigma \subseteq Act$, we use $t \upharpoonright \Sigma$ to denote the trace obtained by removing
from $t$ all occurrences of actions $a \notin \Sigma$. The set of all
traces of $M$ is called the language of $M$, denoted $\mathcal{L}(M)$.

Let $t = \langle a_1, a_2, \ldots, a_n \rangle$ be a finite trace of some LTS $M$.
We use $[t]$ to denote the LTS $M_t = \langle Q, \alpha M, \delta, q_0 \rangle$ with
$Q = \{q_0, q_1, \ldots, q_n\}$, and $\delta = \{(q_{i-1}, a_i, q_i)\}$, where $1 \le i \le n$.
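As an illustration of the two definitions above, the following Python sketch implements projection and the trace LTS $[t]$. The function names `project` and `trace_lts`, the tuple encoding of an LTS, and the event names are assumptions made for this example; they are not taken from the tooling used in this work. For brevity the sketch uses the actions of the trace as the alphabet of $[t]$, whereas the definition above uses $\alpha M$.

```python
from typing import List, Set, Tuple

def project(trace: List[str], alphabet: Set[str]) -> List[str]:
    """Return the trace restricted to `alphabet`: drop every action outside it."""
    return [a for a in trace if a in alphabet]

def trace_lts(trace: List[str]) -> Tuple[Set[int], Set[str], Set[tuple], int]:
    """Build [t]: states 0..n and one transition (i-1, a_i, i) per action."""
    states = set(range(len(trace) + 1))
    delta = {(i, a, i + 1) for i, a in enumerate(trace)}
    # Simplification: the alphabet is taken to be the actions occurring in t.
    return states, set(trace), delta, 0

# Hypothetical channel trace projected onto the alphabet {in, out}.
projected = project(["in", "send", "out"], {"in", "out"})
states, alphabet, delta, q0 = trace_lts(projected)
```

The integer states play the role of $q_0, \ldots, q_n$; any other distinct labels would serve equally well.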

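Parallel Composition. The parallel composition operator
$\parallel$ is commutative and associative. It combines the
behavior of two components by synchronizing the actions
common to their alphabets and interleaving the remaining
actions. Formally, let $M_1 = \langle Q_1, \alpha M_1, \delta_1, q_0^1 \rangle$ and
$M_2 = \langle Q_2, \alpha M_2, \delta_2, q_0^2 \rangle$ be two LTSs. Then $M_1 \parallel M_2$ is an LTS
$M = \langle Q, \alpha M, \delta, q_0 \rangle$, where $Q = Q_1 \times Q_2$, $q_0 = (q_0^1, q_0^2)$,
$\alpha M = \alpha M_1 \cup \alpha M_2$, and $\delta$ is defined as follows, where $a$ is
either an observable action or $\tau$ (note that commutativity
implies the symmetric rules):

$$\frac{M_1 \xrightarrow{a} M_1', \quad a \notin \alpha M_2}{M_1 \parallel M_2 \xrightarrow{a} M_1' \parallel M_2}
\qquad
\frac{M_1 \xrightarrow{a} M_1', \quad M_2 \xrightarrow{a} M_2', \quad a \neq \tau}{M_1 \parallel M_2 \xrightarrow{a} M_1' \parallel M_2'}$$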
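The composition rules can be sketched in Python as follows. This is a minimal illustration under stated assumptions: an LTS is encoded as a tuple `(states, alphabet, delta, q0)`, the string `"tau"` stands for the unobservable action, and the two small channel LTSs are hypothetical. The sketch builds the full product state space rather than only the reachable part.

```python
from itertools import product

TAU = "tau"  # assumed encoding of the unobservable action

def compose(m1, m2):
    """Parallel composition of two LTSs given as (states, alphabet, delta, q0)."""
    q1s, a1, d1, q01 = m1
    q2s, a2, d2, q02 = m2
    delta = set()
    # First rule: M1 moves alone on tau or on actions outside M2's alphabet.
    for (p, a, p_next) in d1:
        if a == TAU or a not in a2:
            delta |= {((p, q), a, (p_next, q)) for q in q2s}
    # Symmetric rule: M2 moves alone.
    for (q, a, q_next) in d2:
        if a == TAU or a not in a1:
            delta |= {((p, q), a, (p, q_next)) for p in q1s}
    # Second rule: both components move together on a shared observable action.
    for (p, a, p_next), (q, b, q_next) in product(d1, d2):
        if a == b and a != TAU:
            delta.add(((p, q), a, (p_next, q_next)))
    return set(product(q1s, q2s)), a1 | a2, delta, (q01, q02)

# Hypothetical two-part channel: m1 reads "in" then sends, m2 receives then emits "out".
m1 = ({0, 1}, {"in", "send"}, {(0, "in", 1), (1, "send", 0)}, 0)
m2 = ({0, 1}, {"send", "out"}, {(0, "send", 1), (1, "out", 0)}, 0)
states, alphabet, delta, q0 = compose(m1, m2)
```

In the composed system, `send` is enabled only when both components are ready for it, while `in` and `out` interleave freely.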

Properties and Satisfiability. A property is specified as
an LTS $P$, whose language $\mathcal{L}(P)$ defines the set of acceptable
behaviors over $\alpha P$. An LTS $M$ satisfies $P$, denoted as
$M \models P$, if and only if $\forall t \in \mathcal{L}(M) \,.\, t \upharpoonright \alpha P \in \mathcal{L}(P)$.
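For a single trace, the satisfaction condition amounts to projecting the trace onto $\alpha P$ and running the result through $P$. The sketch below assumes that $P$ has no $\tau$ transitions and encodes it as a tuple `(states, alphabet, delta, q0)`; the function name `satisfies` and the example property (strict alternation of `in` and `out`) are illustrative assumptions, not necessarily the property used in the figures.

```python
def satisfies(trace, prop):
    """Check that the projection of `trace` onto P's alphabet is a trace of P."""
    _states, alphabet, delta, q0 = prop
    current = {q0}  # set of states, so a nondeterministic P is handled too
    for action in trace:
        if action not in alphabet:
            continue  # projection: actions outside the alphabet are ignored
        current = {dst for (src, a, dst) in delta if src in current and a == action}
        if not current:
            return False  # P cannot follow the trace
    return True

# Hypothetical property over {in, out}: "in" and "out" alternate, starting with "in".
P = ({0, 1}, {"in", "out"}, {(0, "in", 1), (1, "out", 0)}, 0)

ok = satisfies(["in", "send", "out"], P)
bad = satisfies(["send", "out"], P)
```

Checking $M \models P$ for a whole LTS requires this condition for every trace of $M$, which is what a model checker establishes; testing checks it one trace at a time.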

Assume-guarantee Triples. In the assume-guarantee paradigm
a formula is a triple $\langle A \rangle\ M\ \langle P \rangle$, where $M$ is a component,
$P$ is a property and $A$ is an assumption about $M$’s environment
[17]. The formula is true if whenever $M$ is part of a
system satisfying $A$, then the system guarantees $P$. At design
level in our framework, the user expresses all of $A$, $M$, $P$
as LTSs.

Assume-guarantee Reasoning. Consider for simplicity a
system that is made up of components $M_1$ and $M_2$. The aim
of assume-guarantee reasoning is to establish $M_1 \parallel M_2 \models P$
without composing $M_1$ with $M_2$. For this purpose, the
simplest proof rule consists of showing that the following
two premises hold: $\langle A \rangle\ M_1\ \langle P \rangle$ and $\langle \mathit{true} \rangle\ M_2\ \langle A \rangle$. From
these, the rule infers that $\langle \mathit{true} \rangle\ M_1 \parallel M_2\ \langle P \rangle$ also holds.
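Written as an inference rule, with the two premises above the line and the conclusion below it, the simplest assume-guarantee rule reads:

```latex
\[
\frac{\langle A \rangle\ M_1\ \langle P \rangle
      \qquad
      \langle \mathit{true} \rangle\ M_2\ \langle A \rangle}
     {\langle \mathit{true} \rangle\ M_1 \parallel M_2\ \langle P \rangle}
\]
```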

Note that for this rule to be useful, the assumption must
be more abstract than $M_2$, but still reflect $M_2$’s behavior.
Additionally, an appropriate assumption for the rule needs
to be strong enough for $M_1$ to satisfy $P$. Unfortunately, it
is often difficult to find such an assumption.



Our previous work developed frameworks that compute assumptions
automatically for finite-state models and safety
properties expressed as LTSs. More specifically, Giannakopoulou
et al. [9] present an approach to synthesizing the assumption
that a component needs to make about its environment
for a given property to hold. The assumption produced is
the weakest, that is, it restricts the environment no more
and no less than is necessary for the component to satisfy
the property. Barringer et al. [2] and Cobleigh et al. [5]
use a learning algorithm to compute assumptions in an incremental
fashion in the context of simple and symmetric
assume-guarantee rules, respectively.

3. ASSUME-GUARANTEE TESTING 

This section describes our methodology for using the artifacts
of the design-level analysis, i.e. models, properties and
generated assumptions, for testing the implementation of a 
software system. This work assumes a top-down software 
development process, where one creates and debugs design 
models and then uses these models to guide the development 
of source code, possibly by (semi-) automatic code synthesis. 

Our approach is illustrated by Figure 1. Consider a system
that consists of two (finite-state) design models $M_1$ and $M_2$,
and a global safety property $P$. Assume that the property
holds at the design level (if the property does not hold, developers
can use the feedback provided by the verification
framework to correct the models). The assume-guarantee
framework that is used to check that the property holds will
also generate an assumption $A$ that is strong enough for $M_1$
to satisfy $P$ but weak enough to be discharged by $M_2$ (i.e.
$\langle A \rangle\ M_1\ \langle P \rangle$ and $\langle \mathit{true} \rangle\ M_2\ \langle A \rangle$ both hold), as described in
Section 2.

Once design-level verification establishes the property, it is
necessary to verify that the property holds at the implementation
level, i.e. that $C_1 \parallel C_2 \models P$. This work assumes that
each component implements one of the design models, e.g.
components $C_1$ and $C_2$ implement $M_1$ and $M_2$, respectively,
in Figure 1. We propose assume-guarantee testing as a way
of checking $C_1 \parallel C_2 \models P$. This consists of producing test
traces by each of the two components, and checking these
traces against the respective assume-guarantee premises applied
at the design level. If each of the checks succeeds, then
the proof rule guarantees that the composition of the traces
satisfies the property $P$.

We illustrate assume-guarantee testing through a simple example.
Consider a communication channel that has two
components, designs $M_1$ and $M_2$ and corresponding code
$C_1$ and $C_2$ (see Figure 2). Property $P$ describes all legal
executions of the channel in terms of events {in, out}; it
essentially states that for a trace to be legal, in must occur
in the trace before any occurrence of out. Figure 2
also shows the assumption $A$ that design-level analysis of
$M_1 \parallel M_2$ generates (see Section 2). Note that although
$M_1 \parallel M_2 \models P$, $C_1 \parallel C_2$ does not. Testing $C_1$ and $C_2$
in isolation may produce the traces $t_1$ and $t_2$ (respectively)
that Figure 3 (left) shows. Checking $\langle \mathit{true} \rangle\ t_2\ \langle A \rangle$ during
assume-guarantee testing will detect the fact that $t_2$ violates
the assumption $A$ and will thus uncover the problem with
the implementation. Assume now that the developers do not
use assume-guarantee testing, but rather test the assembled
system (we call the latter monolithic testing). The system
might first produce the first two traces illustrated in Figure 3
(right). These traces satisfy the property, which could
lead the developers to mistakenly believe that the system
is correct. They may even achieve some coverage criterion
without detecting the bug, as discussed later in this section.

Figure 1: Design and code level analysis

In summary, assume-guarantee testing can obtain results on 
all interleavings of two individual component traces simply 
by checking each against the appropriate assume-guarantee 
premise. In the context of our example, checking $t_1$ and $t_2$
corresponds to checking all four traces illustrated in Figure 3 
(right). 
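To make concrete what "all interleavings of two component traces" means, the following Python sketch enumerates the maximal runs of the composition of two traces: shared actions must be taken jointly, all others interleave. The traces, alphabets and the shared action `send` are hypothetical and are not the traces of Figure 3; `runs` and `in_before_out` are names introduced for this example only.

```python
def runs(t1, t2, alpha1, alpha2):
    """Yield the maximal runs of [t1] || [t2], synchronizing on shared actions."""
    shared = alpha1 & alpha2

    def go(i, j, prefix):
        moved = False
        if i < len(t1) and t1[i] not in shared:      # first trace moves alone
            moved = True
            yield from go(i + 1, j, prefix + [t1[i]])
        if j < len(t2) and t2[j] not in shared:      # second trace moves alone
            moved = True
            yield from go(i, j + 1, prefix + [t2[j]])
        if i < len(t1) and j < len(t2) and t1[i] == t2[j] and t1[i] in shared:
            moved = True                             # joint step on a shared action
            yield from go(i + 1, j + 1, prefix + [t1[i]])
        if not moved:
            yield prefix                             # nothing enabled: run is maximal

    yield from go(0, 0, [])

def in_before_out(trace):
    """Property of the channel example: no `out` before the first `in`."""
    seen_in = False
    for action in trace:
        if action == "in":
            seen_in = True
        elif action == "out" and not seen_in:
            return False
    return True

# Hypothetical traces: the second component emits `out` before receiving `send`.
t1, alpha1 = ["in", "send"], {"in", "send"}
t2, alpha2 = ["out", "send"], {"send", "out"}
all_runs = list(runs(t1, t2, alpha1, alpha2))
verdicts = [in_before_out(r) for r in all_runs]
```

With these inputs one interleaving satisfies the property and the other violates it, so a monolithic test that happens to observe only the first would miss the bug, whereas a local check of `t2` against a suitable assumption would not depend on which interleaving is observed.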

While our example illustrates the benefits of assume-guarantee 
reasoning for unit testing, similar benefits apply to testing 
of assembled systems. When the system is assembled, the 
testing framework uses assume-guarantee reasoning to con- 
duct analysis that can efficiently predict, based on correct 
system runs, violations by alternative system runs. We dis- 
cuss both flavors of assume-guarantee testing in more detail 
below. 


3.1 Assume-Guarantee Component Testing 

The first step in assume-guarantee component testing involves
the implementation of 1) $U_A$ for $C_1$, where $U_A$ encodes
$C_1$’s universal environment restricted by assumption
$A$, and 2) the universal environment $U$ for $C_2$. The universal
environment for a component may exercise any service
that the component provides in any order, and may provide
or refuse any service that the component requires. The next
step is to execute $C_1$ in $U_A$ and $C_2$ in $U$ to produce sets of
traces $T_1$ and $T_2$ respectively. The technique then performs
assume-guarantee reasoning, checking each trace $t_1 \in T_1$
against $P$ and each trace $t_2 \in T_2$ against $A$. If either of
these checks fails (as in Figure 3), this is an indication that
there is an incompatibility between the models and the implementations,
which the developers can then correct. If all
these tests succeed, then the assume-guarantee rule implies
that $[t_1] \parallel [t_2] \models P$ for all $t_1 \in T_1$ and $t_2 \in T_2$.

Using this approach, one can check system correctness through 
local tests of components. It is possible to perform assume- 
guarantee testing as soon as each component becomes “code 
complete”, and before assembling an executable system or 
even implementing other components. A secondary advan- 
tage of this approach is that it ameliorates the problem of 
choosing appropriate testing environments for components 
in isolation. This is a difficult problem in general, as finding 
an environment that is both general enough to fully exercise 
the component under testing and specific enough to avoid 
many false positives is usually a time-consuming iterative 
process. Here, this problem is reduced to that of correctly
implementing $U_A$ and $U$. Note that alternatively, one may
wish to check preservation of properties by checking directly
that each implemented component refines its model. In our
experience, for well-designed systems, the interfaces between
components are small, and the generated assumptions are
much smaller than the component models. Therefore, it is
more efficient to check the assumptions than to check refinement
directly. Finally, note that, when checking components
in isolation, one has more control over the component interface
(since it is exercised directly rather than through some
other component). As a result, it is easier both to reproduce
problematic behavior and to exercise more traces for
constructing sets $T_1$ and $T_2$.

Figure 2: Assume-guarantee testing

Figure 3: Discovering bugs with fewer tests


Coverage. Unlike model checking, testing is not an ex- 
haustive verification technique. As a result, it is possible for 
defects to escape despite testing. For this reason, software 
quality assurance engineers and researchers on software test- 
ing have traditionally associated the notion of coverage with 
the technique. Coverage criteria dictate how much testing is 
“enough” testing. A typical coverage criterion that works on 
the structure of the code is “node” coverage, which requires 
that the tests performed cover all nodes in the control flow 
graph of a system’s implementation. Assume that in our example
our coverage criterion is node coverage for $C_1$ and $C_2$.
Then $t_1$ and $t_2$ in Figure 3 (left) together achieve 100% coverage.
Similarly, the first trace of the assembled system in
Figure 3 (right) achieves 100% node coverage. It is therefore
obvious that assume-guarantee testing has the potential of
checking more behaviors of the system even when it achieves
the same amount of coverage. This example also reflects the
fact that traditional coverage criteria are often not appropriate
for concurrent or component-based systems, which is
an area of active research. One could also measure coverage
by the number of behaviors or paths through the system 
that are exercised. The question of what benefits assume- 
guarantee reasoning can provide in such a context is open 
research. 

Discussion. As stated above, our hope is that by checking
individual traces of components, the technique covers multiple
traces of the assembled system. Unfortunately, this is
not always true, due to the problem of incompatible traces,
which are traces that do not execute the same shared events
in the same order. These traces are from different execution
paths, and thus give the empty trace on composition.
For example, suppose that the first event in $t_1$ is a function
call on the procedure foo in $C_1$, while the first event in $t_2$
is a function call on the procedure bar in $C_1$; these traces
executed on different paths and are incompatible. Thus,
assume-guarantee testing faces the question of producing
compatible traces during component testing. One potential
way to guarantee that $T_1$ and $T_2$ contain compatible traces
is to use the component models as a coverage metric when
generating traces in $T_1$ and $T_2$, and require that each set of
traces cover certain sequences of shared events in the models.
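Compatibility as described above can be checked mechanically by comparing the projections of the two traces onto the shared alphabet. The following sketch assumes traces are lists of event names; the function name `compatible`, the alphabets and the events `log1` and `log2` are hypothetical, while `foo` and `bar` follow the example in the text.

```python
def compatible(t1, t2, alpha1, alpha2):
    """Two traces are compatible if they execute the same shared events in the same order."""
    shared = alpha1 & alpha2
    return [a for a in t1 if a in shared] == [a for a in t2 if a in shared]

# Hypothetical alphabets: calls to foo and bar are shared, logging events are local.
alpha1 = {"foo", "bar", "log1"}
alpha2 = {"foo", "bar", "log2"}

t1 = ["foo", "log1"]
t2_bad = ["bar", "log2"]    # starts with a different shared event
t2_good = ["log2", "foo"]   # same shared events, local events differ

pairs = [(t1, t2) for t2 in (t2_bad, t2_good) if compatible(t1, t2, alpha1, alpha2)]
```

Filtering $T_1 \times T_2$ in this way shows which pairs of component traces actually contribute system coverage.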

3.2 Predictive Analysis on Assembled Systems 

Assume-guarantee testing can also be a mechanism for pre- 
dictive testing of component assemblies. Assume-guarantee 
testing for predictive analysis has the following steps: 

• Obtain a system trace $t$ (by running $C_1 \parallel C_2$).

• Project the trace on the alphabets of each component;
obtain $t_1 = t \upharpoonright \alpha C_1$ and $t_2 = t \upharpoonright \alpha C_2$.

• Use the design-level assumption to study the composition
of the projections; i.e. check that $\langle A \rangle\ [t_1]\ \langle P \rangle$
and $\langle \mathit{true} \rangle\ [t_2]\ \langle A \rangle$ hold, using model checking.

The approach is illustrated in Figure 4: on the right, we
show a trace $t$ of $C_1 \parallel C_2$ that does not violate the property.
On the left, we show the projections $t_1$ and $t_2$. Note that
$\langle \mathit{true} \rangle\ [t_2]\ \langle A \rangle$ does not hold, hence from a single “good”
trace the methodology has been able to show that $C_1 \parallel C_2$
violates the property. Using the design-level assumption to
analyze the projections is more efficient than composing the
projections and checking that the composition satisfies the
property (as is performed by other predictive analysis techniques)
as long as the assumption is small; in our experience,
this is often the case [9].

Figure 4: Predictive analysis
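The three steps of predictive analysis can be sketched in Python as follows. This is an illustration under several assumptions, not the model-checking setup used in this work: the assumption and the property are deterministic LTSs without $\tau$ transitions, encoded as dictionaries with keys `alphabet`, `delta` and `init`; all function names, alphabets and events are hypothetical.

```python
def project(trace, alphabet):
    """Projection: keep only the actions that belong to `alphabet`."""
    return [a for a in trace if a in alphabet]

def step(lts, state, action):
    """Deterministic step; None means `action` is not enabled in `state`."""
    return lts["delta"].get((state, action))

def in_language(trace, lts):
    """Check <true> [t] <L>: the projection of t onto L's alphabet is a trace of L."""
    state = lts["init"]
    for action in project(trace, lts["alphabet"]):
        state = step(lts, state, action)
        if state is None:
            return False
    return True

def holds_under(assumption, t1, alpha1, prop):
    """Check <A> [t1] <P> by exploring A || [t1] while P observes every action."""
    start = (0, assumption["init"], prop["init"])
    seen, todo = {start}, [start]
    while todo:
        i, qa, qp = todo.pop()
        moves = []  # (action, next index into t1, next assumption state)
        if i < len(t1):
            a = t1[i]
            if a not in assumption["alphabet"]:
                moves.append((a, i + 1, qa))                        # [t1] alone
            elif step(assumption, qa, a) is not None:
                moves.append((a, i + 1, step(assumption, qa, a)))   # synchronized
        for (q, a), q_next in assumption["delta"].items():
            if q == qa and a not in alpha1:
                moves.append((a, i, q_next))                        # A alone
        for a, i_next, qa_next in moves:
            qp_next = qp
            if a in prop["alphabet"]:
                qp_next = step(prop, qp, a)
                if qp_next is None:
                    return False                                    # P violated
            state = (i_next, qa_next, qp_next)
            if state not in seen:
                seen.add(state)
                todo.append(state)
    return True

def predict(t, alpha1, alpha2, assumption, prop):
    """Project a system trace and check both assume-guarantee premises."""
    t1, t2 = project(t, alpha1), project(t, alpha2)
    return holds_under(assumption, t1, alpha1, prop) and in_language(t2, assumption)

# Hypothetical channel: A says the second component sends before each `out`.
A = {"alphabet": {"send", "out"}, "delta": {(0, "send"): 1, (1, "out"): 0}, "init": 0}
P = {"alphabet": {"in", "out"}, "delta": {(0, "in"): 1, (1, "out"): 0}, "init": 0}
alpha1, alpha2 = {"in", "send"}, {"send", "out"}

good_run_hiding_bug = predict(["in", "out", "send"], alpha1, alpha2, A, P)
good_run = predict(["in", "send", "out"], alpha1, alpha2, A, P)
```

The first system trace satisfies the property, yet its projection onto the second component violates the assumption, so the analysis reports a potential violation without enumerating the other interleavings.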

An alternative approach is to generate the assumption directly
from the projected trace $t_1$, and then test that $t_2$
satisfies this assumption. This approach is a way to do 
assume-guarantee predictive testing in a system where there 
are no design-level models. However, it may not be practi- 
cal to generate a new assumption for each trace; we plan to 
experiment with this approach in the future. 

Discussion. It is desirable to use assume-guarantee pre- 
dictive testing as a means of efficiently generating system 
coverage. This technique does not suffer from incompatible 
traces, as the two projected traces occur in the same sys- 
tem trace and are thus guaranteed to be compatible. How- 
ever, to gain the full benefits of assume-guarantee testing 
in this context, trace generation should take into account 
the results of predictive analysis. For example, suppose that 
trace generation produces a trace $t$, projected onto $t_1$ and $t_2$.
Assume-guarantee testing proves that $[t_1] \parallel [t_2] \models P$. Further
trace generation should avoid traces in $[t_1] \parallel [t_2]$ since these
are covered by the assume-guarantee checking of $t_1$ and $t_2$.
Again, one possible way to ensure avoidance of such redun- 
dant traces is to use the design-level model as a coverage 
metric; two traces that have different sequences of shared 
events through the model will project onto different traces. 
Test input generation techniques could also be useful for this 
purpose. This topic is a subject of future work. 

4. EXPERIENCE 

Our case study is the planetary rover controller K9, and 
in particular its executive subsystem, developed at NASA 
Ames Research Center. We performed this study in the con- 
text of an ongoing collaboration with the developers of the 
rover, in which we have performed verification during development
to increase the quality of the design and implementation
of the system. Below we describe the rover executive,
our design-level analysis, how we used the assumptions generated
by this analysis to conduct assume-guarantee testing,
and results of this testing.

Figure 5: The Executive of the K9 Mars Rover

4.1 K9 Rover Executive Subsystem 

The executive sub-system commands the rover through the 
use of high-level plans, which it interprets and executes in
the context of the execution environment. The executive 
monitors execution of plan actions and performs appropriate 
responses and cleanup when execution fails. 

The executive implementation is a multi-threaded system
(see Figure 5), made up of a main coordinating component
named Executive, components for monitoring temporal conditions
(ExecTimerChecker) and state conditions (ExecCondChecker),
and an ActionExecution thread that is responsible
for issuing the commands (actions) to the rover. The communication
between different components (threads) is made
through an EventQueue. The implementation has 35K lines
of C++ code and it uses the POSIX thread library.

4.2 Design-level Analysis 

We previously developed detailed design models for the ex- 
ecutive subsystem [9]. We then checked these models in an 
assume-guarantee manner for several properties specified by 
the developers. Model checking of the design models uncov- 
ered a number of synchronization problems such as dead- 
locks and data races, which we then fixed in collaboration 
with the developers. After finishing this process, for each 
property we had an assumption on one of the components 
stating what behavior was needed of it for the property to 
hold of the entire system. 

4.3 Assume-guarantee Testing Framework 

We have developed a framework that uses the assumptions
and properties built during the design-level analysis for the
assume-guarantee testing of the executive implementation.
In order to apply assume-guarantee testing, we broke up the
implementation into two components, with the Executive
thread, the EventQueue and the ActionExecution thread on
one side ($M_1$), and the ExecCondChecker thread and the
other threads on the other side ($M_2$), as shown in Figure 5.

To test the components in isolation, we generated environments
that encode the design-level assumptions (as described
in Section 3). We implemented each environment as
a thread running a state machine (the respective design-level
assumption) that executes in an infinite loop. In each iteration
of the loop, the environment makes a random choice to
perform an “active” event (such as calling a component function)
that is enabled in the current state; the state machine
then makes the appropriate transition. To make function
calls on the component, we provided dummy values of irrelevant
arguments (while ensuring that these dummy values
did not cause any loss of relevant information). The environment
implementations also provide stubs for the external
functions that the component under testing calls; when
called, these functions cause state machine transitions.

Figure 6: Testing Environment
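The structure of such an environment can be sketched as follows. The sketch is single-threaded and bounded, unlike the threaded infinite loop described above, and every name in it (`Environment`, `ToyComponent`, the events `send` and `ack`) is a hypothetical stand-in rather than part of the rover code. It also applies the state machine transition before calling the component, so that stubs invoked during the call see the updated state; raising an error on a disallowed stub call is likewise a choice of this sketch.

```python
import random

class Environment:
    """Drives a component according to an assumption encoded as a state machine."""

    def __init__(self, transitions, initial, active, component, seed=None):
        self.transitions = transitions  # {(state, event): next_state}
        self.state = initial
        self.active = active            # events the environment itself initiates
        self.component = component      # exposes one method per active event
        self.rng = random.Random(seed)
        self.trace = []

    def enabled_active(self):
        """Active events enabled in the current state, in a stable order."""
        return sorted(e for (s, e) in self.transitions
                      if s == self.state and e in self.active)

    def step(self):
        """One loop iteration: pick an enabled active event at random and perform it."""
        choices = self.enabled_active()
        if not choices:
            return False
        event = self.rng.choice(choices)
        self._fire(event)
        getattr(self.component, event)(self)  # call into the component under test
        return True

    def stub(self, event):
        """Stub for an external function that the component calls."""
        if (self.state, event) not in self.transitions:
            raise AssertionError(f"{event!r} not allowed in state {self.state!r}")
        self._fire(event)

    def _fire(self, event):
        self.state = self.transitions[(self.state, event)]
        self.trace.append(event)

class ToyComponent:
    """Hypothetical component: acknowledges every send through the environment stub."""
    def send(self, env):
        env.stub("ack")

env = Environment({(0, "send"): 1, (1, "ack"): 0}, 0, {"send"}, ToyComponent(), seed=0)
for _ in range(3):
    env.step()
```

The recorded `env.trace` is the kind of event sequence that a run-time monitor would then check against the property or the assumption.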

The methodology uses the Eagle run-time monitoring framework
[3] to check that the components conform to the
assume-guarantee pairs. Eagle is an advanced testing framework
that provides means for constructing test oracles that
examine the internal computational status of the analyzed
system. For run-time monitoring, the user instruments the
program to emit events that provide a trace of the running
system. Eagle then checks whether the current
trace conforms to formalized requirements, stated as temporal
logic assertions or finite-state automata.

For our experiments, we instrumented (by hand) the code 
of the executive components to emit events that appear in 
the design-level assumptions and properties. We also (automatically)
translated these assumptions and properties into
Eagle monitors. 

Note that in order to run the executive system (or its com- 
ponents), the user needs to provide an input plan and an 
environment simulating the actual rover hardware. For our 
assume-guarantee testing experiments, the hardware environment
was stubbed out. For plan input generation, we
built upon our previous work, which combines model check- 
ing and symbolic execution for specification-based test input 
generation [21]. To generate test input plans, we encoded 
the plan language grammar as a nondeterministic specifi- 
cation. Running model checking on this model generates 
hundreds of input plans in a few seconds. 

We have integrated the above techniques to perform assume-guarantee
testing on the executive (see Figure 6). We first
instrument the code and generate Eagle monitors encoding
design-level assumptions and properties. The framework
generates a set of test input plans; a script then runs the executive
on each plan and calls Eagle to monitor the generated
run-time traces. The user can choose to perform a
whole-program (monolithic) analysis or to perform assume-guarantee
reasoning.

4.4 Results 

We ran several experiments (with different input
plans). For one property, we found a discrepancy between
the implementation and the models. The property ($P$) states
that the ExecCondChecker should not push events onto the 
EventQueue unless the Executive has sent the ExecCond- 
Checker conditions to check. The design-level assumption 
(A) on the ExecCondChecker states that the property will 
hold as long as the ExecCondChecker sets a flag variable to 1 
before pushing events, since these assignments only happen 
in response to the Executive sending conditions. 

To check this property, we generated an environment that
drives component $C_1$ (which contains the Executive) according
to assumption $A$. We instrumented $C_1$ to emit relevant
events and we ran Eagle to check if the generated traces
conform to property $P$.

We also generated a universal environment for component
$C_2$ (which contains the ExecCondChecker); we instrumented
$C_2$ to emit events and we used Eagle to check if the generated
traces conform to $A$. In fact, component $C_2$ did not
conform to the assumption. The obtained counterexample
traces exhibit a scenario where the ExecCondChecker
pushes events onto the EventQueue without first setting the 
flag variable to 1. This turned out to be due to the fact 
that an input plan can contain null conditions. Instead of 
putting these in the condition list for monitoring, the Ex- 
ecCondChecker immediately pushes an event to the queue. 
This behavior exposed an inconsistency between the mod- 
els and the implementation, which we corrected. Monolithic 
model checking of the property P would not have uncovered 
this inconsistency. 
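The kind of check involved can be sketched as a small trace monitor. This is plain Python rather than Eagle's specification language, and the event names (`set_flag`, `push_event`, `receive_conditions`, `null_condition`) are hypothetical labels for the instrumented events; the sketch also assumes the simplest reading of the assumption, namely that a push with no preceding flag assignment is a violation.

```python
def first_violation(trace):
    """Return the index of the first push that is not preceded by a flag assignment."""
    flag_set = False
    for index, event in enumerate(trace):
        if event == "set_flag":
            flag_set = True
        elif event == "push_event" and not flag_set:
            return index
    return None

# A conforming trace, and one resembling the null-condition scenario described above.
conforming = ["receive_conditions", "set_flag", "push_event"]
null_condition_case = ["null_condition", "push_event"]

result_ok = first_violation(conforming)
result_bad = first_violation(null_condition_case)
```

Returning the position of the offending event, rather than only a verdict, makes it easier to relate a failed check back to the instrumented code.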

5. RELATED WORK 

Assume-guarantee reasoning leverages the observation that
verification techniques can analyze the individual components
of large systems in isolation to improve performance.
Formal techniques and tools for support of component-based
design and verification are gaining prominence; see for ex- 
ample [1, 6, 8]. All these approaches use some form of envi- 
ronment assumptions (either implicit or explicit), to reason 
about components in isolation. 





Our previous work [10] presented a technique for using design- 
level assumptions for compositional analysis of source code. 
That work used model checking (Java PathFinder [20]), while 
the focus here is on testing. Dingel [7] also uses model checking
(the VeriSoft stateless model checker [11]) for performing
assume-guarantee verification for C/C++ components.
However, the burden of generating assumptions is on the 
user. 

Our work is also related to specification-based testing, a 
widely researched topic. For example, Jagadeesan et al. [14]
and Raymond et al. [18] use formal specifications for the 
generation of test inputs and oracles. These works generate 
test inputs from constraints (or assumptions) on the envi- 
ronment of a software component and test oracles from the 
guarantees of the component under test. The AsmLT Test 
Generator [12] translates Abstract State Machine Language 
(AsmL) specifications into finite state machines (FSMs) and 
different traversals of the FSMs are used to construct test 
inputs. We plan to investigate the use of different traver- 
sal techniques for test input generation from assumptions 
and properties (which are in essence FSMs). None of the 
above-described approaches address predictive analysis. 

Sen et al. [19] have also explored predictive runtime anal- 
ysis of multithreaded programs. Their work uses a partial 
ordering on events to extract alternative interleavings that 
are consistent with the observed interleaving; states from 
these interleavings form a lattice that is similar to our com- 
position of projected traces. However, to verify that no bad 
state exists in this lattice, they construct the lattice level by 
level, while this work proposes using assume-guarantee rea- 
soning to give similar guarantees without having to explore 
the composition of the projected traces. 

Levy et al. [16] use assume-guarantee reasoning in the con- 
text of runtime monitoring. Unlike our work, which aims at 
improving testing, the goal of their work is to combine mon- 
itoring for diverse features, such as memory management, 
security and temporal properties, in a reliable way. 

6. CONCLUSIONS AND FUTURE WORK 

We have presented assume-guarantee testing, an approach 
that improves traditional testing of component-based systems
by predicting violations of system-level requirements
both during testing of individual components and during 
system-level testing. During unit testing, our approach uses 
design-level assumptions as environments for individual com- 
ponents and checks generated traces against premises of an 
assume-guarantee proof rule; the assumptions restrict the 
context in which the components operate, making it more 
likely that failed checks correspond to system-level errors. 
During testing of component assemblies, the technique uses 
assume-guarantee reasoning on component projections of a 
system trace, providing results on alternative system traces. 
We have experimented with our approach in the verification 
of a non-trivial NASA system and report promising results. 

Although we have strong reasons to expect that this tech- 
nique can significantly improve the state of the art in test- 
ing, quantifying its benefits is a difficult task. One reason 
is the lack of appropriate coverage criteria for concurrent 
and component-based systems. Our plans for future work
include coming up with “component-based” testing coverage
criteria, i.e. criteria which, given the decomposition of
global system properties into component properties, deter- 
mine when individual components have been tested enough 
to guarantee correctness of their assembly. One interest- 
ing avenue for future research in this area is the use of the 
models as a coverage metric. 